An Empirical Study on Chinese Microblog Stance Detection Using Supervised and Semi-supervised Machine Learning Methods

Authors

  • Liran Liu
  • Shi Feng
  • Daling Wang
  • Yifei Zhang
Abstract

More and more people now express their opinions and attitudes on microblog platforms. Stance detection is the task of judging whether the author of a text is in favor of or against a given target. Most existing work addresses debates or online conversations, which provide adequate context for inferring the authors' stances. In microblogs, however, the author's stance must be inferred from a single, isolated post, which poses new obstacles for the task. In this paper, we conduct a comprehensive empirical study of microblog stance detection using supervised and semi-supervised machine learning methods. Different strategies for handling unbalanced data and different classifiers, such as Linear SVM, Naive Bayes and Random Forest, are compared on the NLPCC2016 Stance Detection Evaluation Task dataset. Experimental results show that the method based on ensemble learning and SMOTE2 unbalanced-data processing with sentiment word features outperforms the best submission in the NLPCC2016 Evaluation Task.
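
The abstract names SMOTE-style oversampling as part of the best-performing configuration but does not say how the "SMOTE2" variant works. As a rough illustration only, the following is a minimal sketch of the generic SMOTE idea (synthesizing minority-class samples by interpolating between a sample and one of its nearest minority-class neighbours). The function name `smote`, its parameters, and the toy feature vectors are assumptions made for this example; they are not taken from the paper.

```python
import random
from math import dist


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points interpolated between minority samples.

    Generic SMOTE sketch (hypothetical helper, not the paper's SMOTE2 variant).
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other samples
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # k nearest neighbours of `base` among the other minority samples
        neighbours = sorted(
            (p for j, p in enumerate(minority) if j != i),
            key=lambda p: dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic


# Toy data: three minority-class ("against") feature vectors, assumed for illustration.
against = [(0.0, 1.0), (0.2, 0.8), (0.1, 0.9)]
majority_size = 6  # assumed size of the majority ("favor") class
new_points = smote(against, majority_size - len(against), k=2)
balanced = against + new_points
```

In a full pipeline the balanced minority class would be combined with the majority class before training the classifiers; that step is omitted here.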

Similar resources

Overview of NLPCC Shared Task 4: Stance Detection in Chinese Microblogs

This paper presents the overview of the shared task, stance detection in Chinese microblogs, in NLPCC-ICCPOL 2016. The submitted systems are expected to automatically determine whether the author of a Chinese microblog is in favor of the given target, against the given target, or whether neither inference is likely. Different from regular evaluation tasks on sentiment analysis, the microblog te...

Emotion Detection in Persian Text; A Machine Learning Model

This study aimed to develop a computational model for recognizing emotion in Persian text, framed as a supervised machine learning problem. We used the Plutchik emotion model as the supervised learning criterion and a Support Vector Machine (SVM) as the baseline classifier. We also used the NRC lexicon and contextual features as training data and components of the model. One hundred selected texts including pol...

Composite Kernel Optimization in Semi-Supervised Metric

Machine-learning solutions to classification, clustering and matching problems critically depend on the adopted metric, which in the past was selected heuristically. In the last decade, it has been demonstrated that an appropriate metric can be learnt from data, resulting in superior performance as compared with traditional metrics. This has recently stimulated a considerable interest in the to...

Statistical machine learning for data mining and collaborative multimedia retrieval

Thesis entitled "Statistical Machine Learning for Data Mining and Collaborative Multimedia Retrieval", submitted by HOI, Chu Hong (Steven) for the degree of Doctor of Philosophy at The Chinese University of Hong Kong in September 2006. Statistical machine learning techniques have been widely applied in data mining and multimedia information retrieval. While traditional methods, such as supervis...

Semi-Supervised Learning Based Prediction of Musculoskeletal Disorder Risk

This study explores a semi-supervised classification approach that uses random forest as the base classifier to classify the risk of low-back disorders (LBDs) associated with industrial jobs. The semi-supervised approach uses unlabeled data together with a small amount of labeled data to build a better classifier. The results obtained by the proposed approach are compared with those o...

Publication date: 2016